Moving away from the notebook

These Jupyter notebooks provide a nice code playground for looking at Python and testing individual ideas one by one, but it's time to move away from the notebook and into an actual program. You may have already been doing this with the play.py file as we go along, but if not, it's time to open up a real program.

What we're going to do now is work with video, starting with face detection. Open up (in a text editor or code editor...) the file

start.py

This has the following code at the top of it (should look familiar...)


# these imports let you use opencv
import cv2 #opencv itself
import common #some useful opencv functions
import video # some video stuff
import numpy as np # matrix manipulations

For the purposes of the notebook (this page that you're reading now) we're going to have to run some notebook-specific code in order to view images here:


#the following are to do with this interactive notebook code
%matplotlib inline 
from matplotlib import pyplot as plt # this lets you draw inline pictures in the notebooks
import pylab # this allows you to control figure size 
pylab.rcParams['figure.figsize'] = (10.0, 8.0) # this controls figure size in the notebook

The other code in the file start.py opens the webcam (if the webcam works), then just puts the contents of the webcam in a window, counting up the frames, until someone presses escape.
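start.py itself uses the video helper module imported above, but the loop is roughly equivalent to this plain-OpenCV sketch (the window name and details here are illustrative):

    cap = cv2.VideoCapture(0)             # open the default webcam
    framecount = 0
    while True:
        ret, img = cap.read()             # grab a frame
        if not ret:
            break                         # no frame: camera unavailable
        framecount += 1
        cv2.imshow('webcam', img)         # show the frame in a window
        if cv2.waitKey(1) & 0xFF == 27:   # 27 is the escape key
            break
    cap.release()
    cv2.destroyAllWindows()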

Test this now. You should just have a really basic webcam viewer.

What we are going to do now is edit this webcam viewer to introduce detections (from the previous notebooks) and some video processing.

Coding exercise 1

Make your webcam image greyscale, and view that in the webcam viewer
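A minimal sketch, assuming the current frame is in img and the viewer shows a variable called output_image (the variable the later exercises also write to):

    grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # convert the BGR frame to single-channel greyscale
    output_image = grey.copy()                     # display the greyscale version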

Coding exercise 2

Run a face detector on your grey webcam image, and draw the output as rectangles
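A sketch assuming the Haar cascade detector from the earlier notebooks; the cascade filename here is an assumption, so substitute whichever detector file you used before:

    # load the cascade once, outside the video loop (the filename is an assumption)
    face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

    # inside the loop: detect faces in the greyscale frame, draw boxes on the colour frame
    faces = face_cascade.detectMultiScale(grey, 1.3, 5)
    output_image = img.copy()
    for (x, y, w, h) in faces:
        cv2.rectangle(output_image, (x, y), (x+w, y+h), (255, 0, 0), 2)   # blue box (BGR)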

Coding exercise 3: motion detection

Exercise 3 is a little more challenging and involves motion detection. We're going to do this the simplest way, which is to keep a moving average of the colour at each pixel across frames, and then compare that to the current frame.

3a: setting up our constants

Methods often have parameters, and these can be set outside the main video loop. The parameters we're going to set are: fbuffer, the number of frames in our buffer (we'll start with 15, which means that if something stays still for about 15 frames it will fade into the background); alpha, the update rate for our background model, which is a function of fbuffer; and difference_thresh, the amount of difference from the background model that we're willing to accept before saying "actually, that object is moving".

The final thing we set up is our actual moving average model: a variable called "movingaverage", the same size as the images we're getting, initialised with the first image (img).

    fbuffer=15                        # number of frames in our averaging buffer
    alpha=float(1.0/fbuffer)          # update rate for the background model
    difference_thresh=10              # how different from the background a pixel must be to count as moving
    movingaverage=np.float32(img)     # background model, initialised with the first frame

Copy these into the start.py file, outside of the main video loop (after the first frame has been read, so that img exists).

3b: looking at the moving average model

Inside the video loop, we want to take each frame and update our moving average model. This is just one line of code in OpenCV (the first of the lines below), but we also want to take a look at it, so we'll convert the scale so it shows nicely as an image, and copy the result to the output image.

    cv2.accumulateWeighted(img,movingaverage,alpha) 
    res=cv2.convertScaleAbs(movingaverage)
    # show the background model 
    output_image=res.copy()

Add these to the file (before the facedetect section - we don't want to incorporate the face rectangles into our background model!) and take a look at the background model.

Experiment with different values of fbuffer.

3c: looking at the difference image

Take the absolute difference between the current image (img) and the scaled background image result (res): this is the simplest measure of motion at the pixel level. Try looking at that in the output window.

    difference_img = cv2.absdiff(res, img)
    output_image=difference_img.copy()

Again, experiment with different values of fbuffer to see what kinds of motion are detected.

3d: convert the difference image to a binary (moving or not-moving) decision

The difference image is in colour: first we'll convert it to greyscale in order to have it as a single channel. You know how to do this.
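For reference, a one-line sketch producing the grey_difference_img variable used below:

    grey_difference_img = cv2.cvtColor(difference_img, cv2.COLOR_BGR2GRAY)   # single-channel difference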

Once we have a greyscale difference image, we need to threshold it to get a motion mask.

    ret,motionmask = cv2.threshold(grey_difference_img,difference_thresh,255,cv2.THRESH_BINARY)
    output_image=motionmask.copy()

The threshold function takes the variable "difference_thresh", which we declared at the start in 3a... Experiment with different values of difference_thresh to find out what works best in this situation.

Coding exercise 4

Combine the outputs of the background subtractor and the face detector in the visualisation window: the easiest way to do this is to make the motion mask a colour image and then draw the detected faces on that.

4a convert the colourspace of the motion mask

    motionmask_visualisation= cv2.cvtColor(motionmask, cv2.COLOR_GRAY2BGR)

is a way of creating a colour (three-channel) version of your motion mask (which will at this point still look black and white, obviously, as you've not drawn any colour on it yet). You want to do this just after the motion mask is created.

4b copy the face detections from the input image to the motion mask

If you look at the image fundamentals tutorial you can see how to create a little image (call it faceimage) which is a subpart of another image, and how to copy that little image to a named position in a bigger image. That's basically what you need to do now. You can do this in the facedetect loop where you are currently drawing blue squares, either as well as the squares or instead of them.
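A minimal sketch, assuming the detector returned (x, y, w, h) rectangles in a list called faces, as in exercise 2:

    for (x, y, w, h) in faces:
        faceimage = img[y:y+h, x:x+w]                        # crop the face region from the input frame
        motionmask_visualisation[y:y+h, x:x+w] = faceimage   # paste it at the same position in the mask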

Coding exercise 5

There are probably some tiny foreground detections and some noise in your motion detection output.

Try smoothing (noise reduction) with a Gaussian. Remember you'll need to smooth your image before you add it to the background model, and you'll need to compare the smoothed version against the model (otherwise you'll be calculating the difference between the raw input and a Gaussian-smoothed background).
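A sketch of the idea, assuming a 5x5 kernel (the kernel size is worth experimenting with):

    smoothed = cv2.GaussianBlur(img, (5, 5), 0)              # sigma of 0 means it is derived from the kernel size
    cv2.accumulateWeighted(smoothed, movingaverage, alpha)   # build the model from smoothed frames
    res = cv2.convertScaleAbs(movingaverage)
    difference_img = cv2.absdiff(res, smoothed)              # compare smoothed frame against smoothed model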

Additional coding exercises

  1. Use connected components to find large foreground objects (see the sketch after this list) [ http://stackoverflow.com/questions/35854197/how-to-use-opencvs-connected-components-with-stats-in-python ]
  2. Take the connected components bounding box and the face bounding box and work out left-of and right-of relations.
  3. Take a look at some of the trackers you can find in the OpenCV Python samples. In particular, look at the MOSSE tracker and the CamShift tracker. You should be able to run these, although the initialisation might be an issue (you may need to click to initialise).
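A minimal sketch for item 1, assuming the motionmask from exercise 3d (the area threshold of 500 pixels is an illustrative value):

    num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(motionmask)
    for i in range(1, num_labels):        # label 0 is the background
        x, y, w, h, area = stats[i]
        if area > 500:                    # keep only the large foreground objects
            cv2.rectangle(motionmask_visualisation, (x, y), (x+w, y+h), (0, 255, 0), 2)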
